Mining Labelled Tensors by Discovering both their Common and Discriminative Subspaces
Abstract
Conventional non-negative tensor factorization (NTF) methods assume that only one tensor needs to be decomposed into low-rank factors. In practice, however, data are often generated in different time periods or under different class labels, and are therefore represented by a sequence of tensors, each associated with a label. When multiple tensors must be analyzed and compared, existing NTF is unsuitable for discovering all potentially useful patterns: 1) if each tensor is factorized separately, the common information shared by the tensors is lost in the factors, and 2) if the tensors are concatenated into one larger tensor and factorized, the intrinsic discriminative subspaces unique to each tensor are not captured. This issue arises because conventional factorization methods handle data observations in an unsupervised way, considering only the features and not the labels of the data. To tackle this problem, we design a novel factorization algorithm called CDNTF (common and discriminative subspace non-negative tensor factorization), which takes both features and class labels into account in the factorization process. CDNTF takes a set of labelled tensors as input and computes both their common and discriminative subspaces simultaneously as output. We design an iterative algorithm that solves the common and discriminative subspace factorization problem, together with a proof of convergence. Experimental results on graph classification problems demonstrate the effectiveness of the subspaces discovered by our method.
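The idea of separating a shared subspace from label-specific ones can be illustrated with a much simpler matrix analogue. The sketch below is not the CDNTF algorithm described in the abstract: it assumes each labelled data set is a non-negative matrix X_k (rather than a tensor) and fits X_k ≈ Wc Hc_k + Wd_k Hd_k, where Wc is a basis shared by all labels and Wd_k is specific to label k. It uses standard multiplicative updates for the squared Frobenius error. The function name, parameter names, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np

def common_discriminative_nmf(mats, r_common, r_disc, n_iter=200, eps=1e-9, seed=0):
    """Matrix-case illustration (an assumption, not CDNTF itself):
    fit X_k ~= Wc @ Hc[k] + Wd[k] @ Hd[k] with all factors non-negative."""
    rng = np.random.default_rng(seed)
    n = mats[0].shape[0]  # all matrices must share the same number of rows
    Wc = rng.random((n, r_common))                         # shared basis
    Wd = [rng.random((n, r_disc)) for _ in mats]           # per-label bases
    Hc = [rng.random((r_common, X.shape[1])) for X in mats]
    Hd = [rng.random((r_disc, X.shape[1])) for X in mats]

    def loss():
        # Total squared Frobenius reconstruction error over all labels.
        return sum(np.linalg.norm(X - Wc @ Hc[k] - Wd[k] @ Hd[k]) ** 2
                   for k, X in enumerate(mats))

    history = [loss()]
    for _ in range(n_iter):
        # 1) Update the coefficients of each label with the bases fixed.
        for k, X in enumerate(mats):
            W = np.hstack([Wc, Wd[k]])
            H = np.vstack([Hc[k], Hd[k]])
            H *= (W.T @ X) / (W.T @ W @ H + eps)
            Hc[k], Hd[k] = H[:r_common], H[r_common:]
        # 2) Update each label-specific (discriminative) basis.
        for k, X in enumerate(mats):
            approx = Wc @ Hc[k] + Wd[k] @ Hd[k]
            Wd[k] *= (X @ Hd[k].T) / (approx @ Hd[k].T + eps)
        # 3) Update the shared basis using contributions from every label.
        num = sum(X @ Hc[k].T for k, X in enumerate(mats))
        den = sum((Wc @ Hc[k] + Wd[k] @ Hd[k]) @ Hc[k].T
                  for k in range(len(mats)))
        Wc *= num / (den + eps)
        history.append(loss())
    return Wc, Wd, Hc, Hd, history

# Synthetic demo: two "labels" that share one basis and each have their own.
rng = np.random.default_rng(42)
shared = rng.random((20, 2))
data = [np.hstack([shared, rng.random((20, 2))]) @ rng.random((4, 15))
        for _ in range(2)]
Wc, Wd, Hc, Hd, history = common_discriminative_nmf(data, r_common=2, r_disc=2)
print("initial error:", history[0], "final error:", history[-1])
```

In this simplified setting, the columns of Wc play the role of a common subspace and the columns of each Wd[k] the role of a discriminative one. Nothing in the objective forces the two to stay distinct, so a practical method would need additional constraints or label-aware terms, which this sketch leaves out.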
Similar resources
Factorization of Multiple Tensors for Supervised Feature Extraction
Tensors are effective representations for complex and time-varying networks. The factorization of a tensor provides a high-quality low-rank compact basis for each dimension of the tensor, which facilitates the interpretation of important structures of the represented data. Many existing tensor factorization (TF) methods assume there is one tensor that needs to be decomposed to low-rank factors....
A review of text mining approaches and their function in discovering and extracting a topic
Background and aim: Four text mining methods are examined, with a focus on understanding and identifying their properties and limitations in subject discovery. Methodology: The study is an analytical review of the literature of text mining and topic modeling. Findings: LSA could be used to classify specific and unique topics in documents that address only a single topic. The other three text min...
Adapting K-Means Algorithm for Discovering Clusters in Subspaces
Subspace clustering is a challenging task in the field of data mining. Traditional distance measures fail to differentiate the furthest point from the nearest point in very high dimensional data space. To tackle the problem, we design a minimal subspace distance, which measures the similarity between two points in the subspace where they are nearest to each other. It can discover subspace clusters...
Discovering descriptive rules in relational dynamic graphs
Graph mining methods have become quite popular and a timely challenge is to discover dynamic properties in evolving graphs or networks. We consider the so-called relational dynamic oriented graphs that can be encoded as n-ary relations with n ≥ 3 and thus represented by Boolean tensors. Two dimensions are used to encode the graph adjacency matrices and at least one other denotes time. We design t...
Universal Dependency Analysis
Most data is multi-dimensional. Discovering whether any subset of dimensions, or subspaces, of such data is significantly correlated is a core task in data mining. To do so, we require a measure that quantifies how correlated a subspace is. For practical use, such a measure should be universal in the sense that it captures correlation in subspaces of any dimensionality and allows to meaningfull...
Publication date: 2013